Agnostic Distribution Learning via Compression
Abstract
We prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in Rᵈ, up to error ε in total variation distance. This improves both the known upper bound and lower bound for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound. Moreover, these results hold in an agnostic learning setting as well. The upper bound is based on a novel technique for distribution learning that relies on a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in Rᵈ admits an efficient sample compression scheme.
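To make the compression viewpoint concrete, here is a minimal sketch in Python of the encode/decode contract such a scheme must satisfy, illustrated for a single one-dimensional Gaussian. All names here (encode, decode, the choice to keep the first k points) are our own toy illustration, under the assumption that a scheme consists of a few retained samples plus a short bit string; the paper's actual construction for Gaussians in Rᵈ is more involved.

# Toy sketch of a sample compression scheme for distribution learning.
# A scheme is a pair (encode, decode): the encoder keeps a small subset
# of the input samples (plus, in general, a few side bits), and the
# decoder must reconstruct a distribution close to the target from that
# compressed representation alone.

import random
import statistics
from typing import List, Tuple


def encode(samples: List[float], k: int = 10) -> Tuple[List[float], bytes]:
    # Keep only k of the samples; this toy encoder needs no side bits.
    return samples[:k], b""


def decode(retained: List[float], bits: bytes) -> Tuple[float, float]:
    # Recover Gaussian parameters (mean, standard deviation) from the
    # retained points alone.
    return statistics.fmean(retained), statistics.stdev(retained)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(3.0, 2.0) for _ in range(10_000)]
    retained, bits = encode(data)
    mu_hat, sigma_hat = decode(retained, bits)
    print(f"reconstructed N({mu_hat:.2f}, {sigma_hat:.2f}^2) from {len(retained)} points")

The paper's composition results can be read against this interface: if every distribution in a base class can be encoded and decoded this way, then products and mixtures of distributions from that class inherit compression schemes built from the base encoders.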
Similar resources
Distribution-Specific Agnostic Boosting
We consider the problem of boosting the accuracy of weak learning algorithms in the agnostic learning framework of Haussler (1992) and Kearns et al. (1992). Known algorithms for this problem (Ben-David et al., 2001; Gavinsky, 2002; Kalai et al., 2008) follow the same strategy as boosting algorithms in the PAC model: the weak learner is executed on the same target function but over different dis...
On the Power of Membership Queries in Agnostic Learning
We study the properties of the agnostic learning framework of Haussler [Hau92] and Kearns, Schapire and Sellie [KSS94]. In particular, we address the question: is there any situation in which membership queries are useful in agnostic learning? Our results show that the answer is negative for distribution-independent agnostic learning and positive for agnostic learning with respect to a specific...
Sufficient Conditions for Agnostic Active Learnable
We study pool-based active learning in the presence of noise, i.e., the agnostic setting. Previous works have shown that the effectiveness of agnostic active learning depends on the learning problem and the hypothesis space. Although there are many cases in which active learning is very useful, it is also easy to construct examples in which no active learning algorithm can have an advantage. In this pa...
Agnostic Boosting
We extend the boosting paradigm to the realistic setting of agnostic learning, that is, to a setting where the training sample is generated by an arbitrary (unknown) probability distribution over examples and labels. We define a β-weak agnostic learner with respect to a hypothesis class F as follows: given a distribution P it outputs some hypothesis h ∈ F whose error is at most er_P(F) + β, where er...
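Reconstructing the truncated guarantee in standard notation (our rendering; er_P(h) denotes the error of h under the distribution P, and er_P(F) the best error attainable within F):

\mathrm{er}_P(h) \;\le\; \mathrm{er}_P(F) + \beta, \qquad \text{where } \mathrm{er}_P(F) = \min_{f \in F} \mathrm{er}_P(f).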
Journal: CoRR
Volume: abs/1710.05209
Issue: -
Pages: -
Publication date: 2017